Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/global context #6537

Merged
merged 11 commits into from
Dec 18, 2017
Merged

Conversation

dzhwinter
Copy link
Contributor

fix #6536

// delete device_context;
// }
// }
// }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need Executor::Executor(const platform::DeviceContext& device) for recurrent_op.
https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/operators/recurrent_op.cc#L236

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Executor::Executor(const platform::DeviceContext& device) should be removed in whole project.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since Executor(places) no longer owns the DeviceContext, I believe you are right.

class DeviceContextPool {
public:
static DeviceContextPool& Get() {
PADDLE_ENFORCE_NOT_NULL(pool, "Need to Create DeviceContextPool first!");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can create if nullptr here. So the Create function is not neded.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, a singleton ONLY always mix these two function together, but here we have a slightly problem.

Create will create the all the devices. But when we need get always only need create part of it.
For example, Init function create all the available GPU, CPU devices, when we want to create two or more executors, then Get() happens part may not have all the device information.

In previous implementation, we definitly have one and only one executor, but now it is different since executor does not contain resources, we may have two or more executors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense.

}
// TODO(dzhwinter) : assign first find device. Will enhanced later.
borrowed_contexts.emplace_back(range.first->second);
device_tasks_[place] += 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a reference count? See no place to decrease it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, it is useless. We will improve it after migrating to DevicePool for GPU.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then add some TODO comments here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

ASSERT_EQ(InitDevices(ds1), true);

#ifdef PADDLE_WITH_CUDA
std::vector<std::string> ds2 = {"CPU", "GPU:0", "GPU:1"};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please document the format of place strings, or add a TODO

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is documented in the init.cc function definition part.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, for users.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -132,8 +112,8 @@ void Executor::Run(const ProgramDescBind& pdesc, Scope* scope, int block_id,
}
}

Executor::Executor(const platform::DeviceContext& device)
: device_contexts_({&device}), own_(false) {}
// Executor::Executor(const platform::DeviceContext& device)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why comment these lines? I think this ctor is still useful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This interface should be removed, because we only need the place indicator. I will rewrite how the while_op, recurrent_op using executor in next PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see.

Before rewriting ops that need to run executor, please take a look at #6664

explicit Executor(const std::vector<platform::Place>& places);
explicit Executor(const platform::DeviceContext& devices);
~Executor();
// ~Executor();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete the commented code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

// device format
// CPU
// GPU:1
// FPGA:2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No FPGA in the following code, so maybe we can delete this comment firstly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

ASSERT_EQ(InitDevices(ds1), true);

#ifdef PADDLE_WITH_CUDA
std::vector<std::string> ds2 = {"CPU", "GPU:0", "GPU:1"};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the CPU means all CPU cores? do we need a approach to specify the number of CPU cores?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It only means one CPUDeviceContext, our multi-CPU design is not complete at the current stage, so I think that specify the number of CPU cores is useless.

Copy link
Contributor

@typhoonzero typhoonzero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM++

@dzhwinter dzhwinter dismissed tonyyang-svail’s stale review December 18, 2017 05:01

Executor(places) no longer owns the DeviceContext, I will fix DeviceContext creation inside op in next PR.

@dzhwinter dzhwinter merged commit 24fda39 into PaddlePaddle:develop Dec 18, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement a pool of DeviceContext in init stage
4 participants